AMENDMENTS TO THE CLAIMS

Please amend the Claims as follows:

1. (Original) A computer-based method [[for]] to perform query optimization by automatically
finding and exploiting hidden, fuzzy algebraic constraints in a database, said method comprising
the steps of:

(a) constructing one or more candidates of form $C = (a_1, a_2, P, \oplus)$, wherein $a_1$ and $a_2$ are
numerical attributes associated with column values of data in said database, P is a pairing rule,
and $\oplus$ is any of the following algebraic operators: +, -, x, or /;

(b) constructing, for each candidate identified in (a), an algebraic constraint
$AC = (a_1, a_2, P, \oplus, I_1, \ldots, I_k)$ [[by applying any of, or a combination of the following
techniques to a sample of column values: statistical histogramming, segmentation, or clustering
technique]], where $I_1, \ldots, I_k$ is a set of disjoint intervals and $k > 1$, said step of
constructing algebraic constraint further comprising the steps of:

constructing a sample set $W_C$ of an induced set $\Omega_C$, wherein P is a join
predicate between tables R and S and $\Omega_C = \{r.a_1 \oplus r.a_2 : r \in R\}$ when the pairing
rule P is a trivial rule $\emptyset_R$ and
$\Omega_C = \{r.a_1 \oplus s.a_2 : r \in R, s \in S, \text{ and } (r,s) \text{ satisfies } P\}$;

sorting n data points in said sampled set $W_C$ in increasing order as $x_1 < x_2 < \ldots < x_n$
and constructing a set of disjoint intervals $I_1, \ldots, I_k$ such that data in sample
$W_C$ falls within one of said disjoint intervals, wherein segmentation for
constructing said set of disjoint intervals is specified via a vector of indices
$(i(1), i(2), \ldots, i(k))$ and the $j$th interval is given by $I_j = [x_{i(j-1)+1}, x_{i(j)}]$ and
length of $I_j$, denoted by $L_j$, is given by $L_j = x_{i(j)} - x_{i(j-1)+1}$; and

wherein the function for optimizing cost associated with said segmentation is
$c(s) = wk + (1-w)\left[\frac{1}{\Delta}\sum_{j=1}^{k} L_j\right]$ with w being a fixed weight
between 0 and 1 and a segmentation that minimizes c is defined by placing adjacent points
$x_l$ and $x_{l+1}$ in the same segment if and only if $x_{l+1} - x_l < d^*$, where
$d^* = \Delta(w/(1-w))$, and
wherein said constructed algebraic constraints are used in query optimization.
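
The gap rule recited at the end of claim 1 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The function name `segment` is hypothetical, and the sketch assumes that Δ is the range of the sorted sample (largest point minus smallest point), which the claim text itself does not define.

```python
from typing import List, Tuple


def segment(sample: List[float], w: float) -> List[Tuple[float, float]]:
    """Group sorted sample points into disjoint intervals using the gap rule."""
    if not 0.0 < w < 1.0:
        raise ValueError("w must lie strictly between 0 and 1")
    xs = sorted(sample)
    if not xs:
        return []
    delta = xs[-1] - xs[0]  # assumption: Delta is the range x_n - x_1 of the sample
    if delta == 0:
        return [(xs[0], xs[0])]  # all points coincide: one degenerate interval
    d_star = delta * (w / (1.0 - w))  # threshold d* = Delta * (w / (1 - w))
    intervals: List[Tuple[float, float]] = []
    start = prev = xs[0]
    for x in xs[1:]:
        if x - prev < d_star:
            prev = x  # gap below d*: stay in the current segment
        else:
            intervals.append((start, prev))  # close the segment at the previous point
            start = prev = x
    intervals.append((start, prev))
    return intervals
```

With the points 1, 2, 3, 10, 11, 12 and w = 0.2, the threshold is 11 x 0.25 = 2.75, so the gap of 7 between 3 and 10 splits the sample into the intervals [1, 3] and [10, 12].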

2. (Original) A computer-based method as per claim 1, wherein one or more pruning rules are
used to limit said number of constructed candidates. 

3. (Original) A computer-based method as per claim 2, wherein said pairing rule P represents 
either a trivial pairing rule $\emptyset_R$ or a join between tables R and S and said pruning rules comprise
any of, or a combination of the following: 

pairing rule P is of form R.a = S.b or of the form $\emptyset_R$, and the number of rows in either
table R or table S lies below a specified threshold value; 

pairing rule P is of form R.a = S.b with $a \in K$ and the number of distinct values in S.b
divided by the number of values in R.a lies below a specified threshold value, wherein K is a set 
comprising key-like columns among all columns in said database; 

pairing rule P is of form R.a = S.b, and one or both of R and S fails to have an index on 
any of its columns; or 

pairing rule P is of form R.a = S.b with $a \in K$, and S.b is a system-generated key.
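
The four pruning rules of claim 3 can be read as a single predicate over statistics that describe a pairing rule. The Python sketch below is illustrative only. The class `PairingRuleStats`, its field names, and the two threshold parameters are hypothetical and do not appear in the claims.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PairingRuleStats:
    """Hypothetical catalog statistics for one pairing rule."""
    r_rows: int
    s_rows: Optional[int] = None         # None marks a trivial rule on table R alone
    a_is_key_like: bool = False          # whether column a belongs to the set K
    distinct_b: int = 0                  # number of distinct values in S.b
    values_a: int = 0                    # number of values in R.a
    r_has_index: bool = True             # R has an index on at least one column
    s_has_index: bool = True             # S has an index on at least one column
    b_is_system_generated: bool = False  # S.b is a system-generated key


def should_prune(rule: PairingRuleStats, min_rows: int, min_ratio: float) -> bool:
    """Return True when any of the four pruning rules applies."""
    trivial = rule.s_rows is None
    # Rule 1: either table has fewer rows than the threshold.
    if rule.r_rows < min_rows or (not trivial and rule.s_rows < min_rows):
        return True
    if trivial:
        return False  # the remaining rules only concern joins R.a = S.b
    # Rule 2: a is key-like and distinct(S.b) / values(R.a) is below the threshold.
    if rule.a_is_key_like and rule.values_a > 0:
        if rule.distinct_b / rule.values_a < min_ratio:
            return True
    # Rule 3: one or both tables lack an index on any column.
    if not (rule.r_has_index and rule.s_has_index):
        return True
    # Rule 4: a is key-like and S.b is a system-generated key.
    return rule.a_is_key_like and rule.b_is_system_generated
```

The claim allows any one rule or any combination of them, so an implementation could equally enable only a subset of these checks.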

4. (Original) A computer-based method as per claim 1, wherein said method further comprises
the steps of: 

identifying a set of useful algebraic constraints via one or more pruning rules; and 
partitioning data into compliant data and exception data. 

5. (Original) A computer-based method as per claim 4, wherein said method further comprises 
the steps of: 

receiving a query; 

modifying said query to incorporate identified constraints; and 

combining results of modified query executed on data in said database and said original 
query executed on exception data. 
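
The query handling of claims 4 and 5 can be sketched with Python's standard `sqlite3` module. The sketch is an illustration under stated assumptions, not the claimed implementation. The table names, the columns `order_day` and `ship_day`, and the example constraint (ship day minus order day lies in [0, 5]) are all hypothetical. The sketch also assumes that compliant and exception rows are stored in two physical tables, which is only one of the partitioning options listed in claim 6.

```python
import sqlite3


def build_db() -> sqlite3.Connection:
    """Partition example rows into compliant data and exception data."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE orders_compliant (id INTEGER, order_day INTEGER, ship_day INTEGER)")
    con.execute("CREATE TABLE orders_exception (id INTEGER, order_day INTEGER, ship_day INTEGER)")
    rows = [(1, 10, 12), (2, 11, 14), (3, 20, 22), (4, 5, 40)]
    for row in rows:
        _, order_day, ship_day = row
        # Hypothetical constraint: ship_day - order_day lies in the interval [0, 5].
        compliant = 0 <= ship_day - order_day <= 5
        table = "orders_compliant" if compliant else "orders_exception"
        con.execute(f"INSERT INTO {table} VALUES (?, ?, ?)", row)
    return con


def orders_shipped_on(con: sqlite3.Connection, ship_day: int) -> list:
    """Combine the modified query on compliant data with the original query on exceptions."""
    # Modified query: the constraint adds a range predicate on order_day.
    modified = ("SELECT id FROM orders_compliant "
                "WHERE ship_day = ? AND order_day BETWEEN ? AND ?")
    # Original query, run unchanged against the exception data.
    original = "SELECT id FROM orders_exception WHERE ship_day = ?"
    cur = con.execute(modified + " UNION ALL " + original,
                      (ship_day, ship_day - 5, ship_day, ship_day))
    return sorted(r[0] for r in cur.fetchall())
```

The added range predicate on `order_day` is what gives an optimizer a new access path, for example an index on `order_day`. The exception table keeps the combined result correct for rows that violate the constraint.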

6. (Original) A computer-based method as per claim 4, wherein said partitioning is done by 
incrementally maintained materialized views, partial indices, or physical partitioning of the table. 

7. (Original) A computer-based method as per claim 2, wherein said pruning rules comprise 
any of, or a combination of the following: 

$a_1$ and $a_2$ are not comparable data types;

the fraction of NULL values in either $a_1$ or $a_2$ exceeds a specified threshold; or
either column $a_1$ or $a_2$ is not indexed.

8. (Original) A computer-based method as per claim 1, wherein said step of constructing one or 
more candidates further comprises the steps of: 

generating a set $\mathcal{P}$ of pairing rules; and



Docket: ARC92003004 4US1 
Application: 10/697,052 

for each pairing rule $P \in \mathcal{P}$, systematically considering possible attribute pairs $(a_1, a_2)$ and
operators $\oplus$ with which to construct candidates.

9. (Original) A computer-based method as per claim 8, wherein said step of generating a set $\mathcal{P}$
of pairing rules further comprises the steps of:

initializing $\mathcal{P}$ to be an empty set;

adding a trivial pairing rule of the form $\emptyset_R$ to said set $\mathcal{P}$ for each table R in said
database; and

generating and adding nontrivial pairing rules to said set $\mathcal{P}$ based upon identifying
matching columns via an inclusion dependency, wherein a column b is considered a match for
column a if:

data in columns a and b are of a comparable type; or 

either (i) column a is a declared primary key and column b is a declared foreign 
key for the primary key, or (ii) every data value in a sample from column b has a 
matching value in column a. 
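
The matching test of claim 9 can be sketched in Python as follows. The function names, the `sample_size` parameter, and the use of Python's `random` module for sampling are assumptions made for illustration. The conditions are combined with "or" exactly as the claim recites them.

```python
import random
from typing import Hashable, Sequence


def sample_included(col_a: Sequence[Hashable], col_b: Sequence[Hashable],
                    sample_size: int = 100, seed: int = 0) -> bool:
    """Check that every value in a sample from column b also occurs in column a."""
    values_a = set(col_a)
    if len(col_b) <= sample_size:
        sample = list(col_b)  # small column: inspect every value
    else:
        sample = random.Random(seed).sample(list(col_b), sample_size)
    return all(value in values_a for value in sample)


def is_match(col_a: Sequence[Hashable], col_b: Sequence[Hashable],
             comparable_type: bool, declared_pk_fk: bool,
             sample_size: int = 100) -> bool:
    """Column b matches column a if any recited condition holds."""
    return (comparable_type
            or declared_pk_fk
            or sample_included(col_a, col_b, sample_size))
```

Checking only a sample of column b keeps the inclusion test cheap on large tables, at the price that a match found this way holds for the sample rather than for every row.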

10. (Original) A computer-based method as per claim 8, wherein said step of generating a set $\mathcal{P}$
of pairing rules further comprises the steps of:

initializing $\mathcal{P}$ to be an empty set;

adding a trivial pairing rule of the form $\emptyset_R$ to said set $\mathcal{P}$ for each table R in said
database; and

generating a set K of key-like columns from among all columns in said database with
each column in set K belonging to a predefined set of types T, said set K comprising declared
primary key columns, declared unique key columns, and undeclared key columns, wherein said
primary keys or declared unique keys are compound keys of form $a = (a_1, \ldots, a_m) \in T^m$ for $m > 1$;

adding nontrivial pairing rules to said set $\mathcal{P}$ based upon identifying matching compound
columns via an inclusion dependency wherein, given a compound key $(a_1, \ldots, a_m) \in K$, a compound
column b is considered a component wise match for compound column a if:

data in compound columns a and b are of a comparable type; or 
either (i) compound column a is a declared primary key and compound column b 
is a declared foreign key for the primary key, or (ii) every data value in a sample from 
compound column b has a matching value in compound column a. 

11. (Cancelled) 

12. (Currently Amended) A computer-based method as per [[claim 11]] claim 1, wherein widths
associated with said intervals are expanded to avoid additional sampling required to increase
right end point to equal maximum value in $\Omega_C$.

13. (Currently Amended) A computer-based method as per [[claim 11]] claim 1, wherein size of said
sampled set is approximated via the following iterative steps:

(a) given a k-segmentation, setting counters i=1 and k=1;

(b) selecting a sample size $n = n^*(k)$, wherein
$n^*(k) \approx \frac{\chi^2_{1-p}(2-f)}{4f} + \frac{k}{2}$, wherein p is the
probability that at least a fraction of points in $\Omega_C$ that lie outside the intervals is at most f;

(c) obtaining a sample based on (b), computing algebraic constraints, and identifying a
number k' of bump intervals; and
(d) if $n > n^*(k')$ or $i = i_{max}$, then utilizing sample size in (b); else setting counters k=k' and
i=i+1, and returning to step (b).
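
The control flow of steps (a) through (d) can be sketched in Python. To avoid committing to a particular sample-size formula, the sketch takes the function n*(k) as a caller-supplied callable `n_star`. Step (c) is likewise abstracted as a hypothetical callable `count_bumps` that draws a sample of the given size and returns the number of bump intervals found. Neither name comes from the claims.

```python
from typing import Callable


def choose_sample_size(n_star: Callable[[int], int],
                       count_bumps: Callable[[int], int],
                       i_max: int) -> int:
    """Iterate steps (a)-(d) until the sample size is adequate or i reaches i_max."""
    if i_max < 1:
        raise ValueError("i_max must be at least 1")
    i, k = 1, 1                                  # step (a): initialize counters
    while True:
        n = n_star(k)                            # step (b): sample size for k intervals
        k_prime = count_bumps(n)                 # step (c): bump intervals seen in the sample
        if n > n_star(k_prime) or i == i_max:    # step (d): stopping test as recited
            return n
        k, i = k_prime, i + 1                    # otherwise retry with the observed k'
```

The cap `i_max` guarantees termination even when the observed number of bump intervals keeps changing between iterations.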

14. (Cancelled). 

15. (Cancelled). 

16. (Original) A computer-based method as per claim 1, wherein said method is implemented 
across networks. 

17. (Original) A computer-based method as per claim 16, wherein said across networks element 
comprises any of, or a combination of the following: local area network (LAN), wide area 
network (WAN), or the Internet. 

18. (Cancelled). 

19. (Cancelled). 

20. (Cancelled). 

21. (Cancelled). 

22. (Currently Amended) An article of manufacture comprising a computer usable medium
having computer readable program code embodied therein which implements a method to
perform query optimization by [[for]] automatically finding and exploiting hidden, fuzzy algebraic
constraints in a database, said method comprising the steps of:

(a) computer readable program code constructing one or more candidates of form
$C = (a_1, a_2, P, \oplus)$, wherein $a_1$ and $a_2$ are numerical attributes associated with column values of data in
said database, P is a pairing rule, and $\oplus$ is any of the following algebraic operators: +, -, x, or /;

(b) computer readable program code constructing, for each candidate identified in (a), an
algebraic constraint $AC = (a_1, a_2, P, \oplus, I_1, \ldots, I_k)$ [[by applying any of, or a combination of the
following techniques to a sample of column values: statistical histogramming, segmentation
technique, or clustering]], where $I_1, \ldots, I_k$ is a set of disjoint intervals and $k > 1$, said step of
constructing algebraic constraint further comprising the steps of:

constructing a sample set $W_C$ of an induced set $\Omega_C$, wherein P is a join
predicate between tables R and S and $\Omega_C = \{r.a_1 \oplus r.a_2 : r \in R\}$ when the pairing
rule P is a trivial rule $\emptyset_R$ and
$\Omega_C = \{r.a_1 \oplus s.a_2 : r \in R, s \in S, \text{ and } (r,s) \text{ satisfies } P\}$;

sorting n data points in said sampled set $W_C$ in increasing order as $x_1 < x_2 < \ldots < x_n$
and constructing a set of disjoint intervals $I_1, \ldots, I_k$ such that data in sample
$W_C$ falls within one of said disjoint intervals, wherein segmentation for
constructing said set of disjoint intervals is specified via a vector of indices
$(i(1), i(2), \ldots, i(k))$ and the $j$th interval is given by $I_j = [x_{i(j-1)+1}, x_{i(j)}]$ and
length of $I_j$, denoted by $L_j$, is given by $L_j = x_{i(j)} - x_{i(j-1)+1}$; and
wherein the function for optimizing cost associated with said segmentation is
$c(s) = wk + (1-w)\left[\frac{1}{\Delta}\sum_{j=1}^{k} L_j\right]$ with w being a fixed weight
between 0 and 1 and a segmentation that minimizes c is defined by placing adjacent points
$x_l$ and $x_{l+1}$ in the same segment if and only if $x_{l+1} - x_l < d^*$, where
$d^* = \Delta(w/(1-w))$, and

wherein said constructed algebraic constraints are used in query optimization. 

23. (Original) An article of manufacture as per claim 22, wherein said medium further 
comprises: 

computer readable program code identifying a set of useful algebraic constraints via 
heuristics comprising a set of pruning rules; and 

computer readable program code partitioning data into compliant data and exception data. 

24. (Original) An article of manufacture as per claim 23, wherein said medium further 
comprises: 

computer readable program code aiding in receiving a query; 

computer readable program code modifying said query to incorporate identified 
constraints; and 

computer readable program code combining results of modified query executed on data in 
said database and said original query executed on exception data. 

Please cancel claims 25-38. 